Arabic Spoken Language Identification System (ASLIS): A Proposed System to Identifying Modern Standard Arabic (MSA) and Egyptian Dialect
Abstract
Millions of people around the world speak many different languages. To communicate, speakers must first know which language is being used, and a language identification system performs this task automatically. Over the last forty years, most Automatic Speech Recognition research has concentrated on English and other languages, while research on Arabic has grown very slowly by comparison. Arabic has many different dialects, which must be identified before Automatic Speech Recognition can take place. This paper describes the design and implementation of a new spoken language identification system, the Arabic Spoken Language Identification System (ASLIS). It focuses on only two major dialects: Modern Standard Arabic (MSA) and Egyptian. The identifier is based on Hidden Markov Models (HMMs) and is developed using the portable Hidden Markov Model Toolkit (HTK).
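The general idea behind an HMM-based identifier can be illustrated with a minimal sketch: one HMM per dialect, with an utterance assigned to the model that gives it the highest likelihood. This is not the ASLIS implementation and does not use HTK. It assumes toy discrete-observation HMMs whose symbols stand in for quantized acoustic features, and all parameter values and names (`MODELS`, `identify`, `log_likelihood`) are hypothetical, chosen only for illustration.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm in log space for a discrete-observation HMM.

    obs   : list of symbol indices (hypothetical quantized features)
    start : start[s] = P(first state is s)
    trans : trans[p][s] = P(next state is s | current state is p)
    emit  : emit[s][o] = P(symbol o | state s)
    All probabilities are assumed to be strictly positive.
    """
    n = len(start)
    alpha = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + math.log(trans[p][s]) for p in range(n)])
            + math.log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


# Toy parameters (assumptions, not taken from the paper): two states and
# three symbols per model. The "msa" model favours symbol 0 and the
# "egyptian" model favours symbol 2.
MODELS = {
    "msa": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1]],
    ),
    "egyptian": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.1, 0.1, 0.8], [0.1, 0.2, 0.7]],
    ),
}


def identify(obs, models=MODELS):
    """Return the best-scoring model name and the per-model log-likelihoods."""
    scores = {name: log_likelihood(obs, *params) for name, params in models.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    label, scores = identify([0, 0, 1, 0])
    print(label, scores)
```

A real system would train continuous-density HMMs on acoustic feature vectors extracted from speech rather than hand-set discrete tables. The decision rule, picking the model with the highest likelihood, is the part this sketch is meant to show.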
Similar Resources
Spoken Arabic Dialect Identification Using Phonotactic Modeling
The Arabic language is a collection of multiple variants, among which Modern Standard Arabic (MSA) has a special status as the formal written standard language of the media, culture and education across the Arab world. The other variants are informal spoken dialects that are the media of communication for daily life. Arabic dialects differ substantially from MSA and each other in terms of phono...
Automatic Dialect Detection in Arabic Broadcast Speech
In this paper, we investigate different approaches for dialect identification in Arabic broadcast speech. These methods are based on phonetic and lexical features obtained from a speech recognition system, and bottleneck features using the i-vector framework. We studied both generative and discriminative classifiers, and we combined these features using a multi-class Support Vector Machine (SVM...
QMDIS: QCRI-MIT Advanced Dialect Identification System
As a continuation of our efforts towards tackling the problem of spoken Dialect Identification (DID) for Arabic languages, we present the QCRI-MIT Advanced Dialect Identification System (QMDIS). QMDIS is an automatic spoken DID system for Dialectal Arabic (DA). In this paper, we report a comprehensive study of the three main components used in the spoken DID task: phonotactic, lexical and acous...
Conventional Orthography for Dialectal Arabic
Dialectal Arabic (DA) refers to the day-to-day vernaculars spoken in the Arab world. DA lives side-by-side with the official language, Modern Standard Arabic (MSA). DA differs from MSA on all levels of linguistic representation, from phonology and morphology to lexicon and syntax. Unlike MSA, DA has no standard orthography since there are no Arabic dialect academies, nor is there a large edited...
Named Entity Recognition for Dialectal Arabic
To date, the majority of research on Arabic Named Entity Recognition (NER) addresses the task for Modern Standard Arabic (MSA) and mainly focuses on the newswire genre. Despite some common characteristics between MSA and Dialectal Arabic (DA), the significant differences between the two language varieties hinder such MSA-specific systems from solving NER for Dialectal Arabic. In this paper, we pre...